05. Assessing Data
Using Pandas, explore winequality-red.csv and winequality-white.csv in the Jupyter notebook below to answer the quiz questions beneath the notebook about the following characteristics of the datasets:
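- Number of samples in each dataset
- Number of columns in each dataset
- Features with missing values
- Duplicate rows in the white wine dataset
- Number of unique quality values in each dataset
- Mean density of the red wine dataset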
This data originally came from here.
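The characteristics listed above can be gathered with a handful of pandas calls. The following is a minimal sketch, not the course's own solution: the helper name `summarize`, the tiny inline sample, and its values are made up for illustration, and the commented-out `read_csv` line assumes the files are semicolon-separated, as in the UCI distribution of this dataset.

```python
import io

import pandas as pd


def summarize(df: pd.DataFrame, quality_col: str = "quality") -> dict:
    """Collect the dataset characteristics asked about in the quizzes.

    `summarize` is a hypothetical helper written for this example.
    """
    missing = df.isnull().sum()  # count of missing values per column
    return {
        "samples": df.shape[0],                       # number of rows
        "columns": df.shape[1],                       # number of columns
        "features_with_missing": missing[missing > 0].index.tolist(),
        "duplicate_rows": int(df.duplicated().sum()),  # repeats of an earlier row
        "unique_quality": df[quality_col].nunique(),   # distinct quality values
        "mean_density": df["density"].mean(),
    }


# Made-up sample in the same layout as the wine files (illustration only).
sample_csv = """fixed acidity;density;quality
7.4;0.9978;5
7.8;0.9968;5
7.4;0.9978;5
11.2;0.9980;6
"""
sample = pd.read_csv(io.StringIO(sample_csv), sep=";")
print(summarize(sample))

# With the real files (assumption: semicolon-separated):
# red = pd.read_csv("winequality-red.csv", sep=";")
# print(summarize(red))
```

Note that `duplicated()` marks only the second and later occurrences of a row, so the count is the number of rows that `drop_duplicates()` would remove.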
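Course Workspace Notes
This course includes several workspace exercises that let you complete the project without setting up a local environment. If you would rather work locally, click the "jupyter" button in the upper-left corner of the workspace to list all the files stored in it. You can then use upload or download to move files into and out of the workspace.
Some later exercises may rely on output produced by earlier ones. If you see a "FileNotFoundError", download the generated data from the workspace that produced it and upload it to the workspace that needs it.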
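One way to make that failure easier to diagnose is to check for the file before loading it. This is only a sketch: `require_file` is a hypothetical helper, and the file name in the commented usage line is a placeholder rather than a file the course is known to produce.

```python
from pathlib import Path


def require_file(path: str) -> Path:
    """Return the path if the file exists, otherwise raise with a hint."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(
            f"{p} not found: download it from the workspace that produced it "
            "and upload it to this workspace."
        )
    return p


# Hypothetical usage; the file name is a placeholder:
# data_path = require_file("output_from_previous_exercise.csv")
```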
If you run into problems with the exercises, you can ask a teaching assistant, post on the forum, or email support@youdaxue.com.
Workspace
QUESTION:
How many samples of red wine are there?
SOLUTION:
NOTE: The solutions are expressed in RegEx pattern. Udacity uses these patterns to check the given answer
QUESTION:
How many samples of white wine are there?
SOLUTION:
QUESTION:
How many columns are in each dataset?
SOLUTION:
SOLUTION:
- None of these features have missing values
QUESTION:
How many duplicate rows are in the white wine dataset?
SOLUTION:
SOLUTION:
Not necessary
QUESTION:
How many unique quality values are in the red wine dataset?
SOLUTION:
QUESTION:
How many unique quality values are in the white wine dataset?
SOLUTION: